Auxiliary variables in conditional Gaussian mixtures for automatic speech recognition

Authors

Todd A. Stephenson

Mathew Magimai-Doss

Hervé Bourlard

Abstract

In previous work, we presented a case study that used an estimated pitch value as the conditioning variable in conditional Gaussians. It showed the utility of hiding the pitch values in certain situations and of modeling them independently of the hidden state in others. Because that work used only single conditional Gaussians, we extend it here to conditional Gaussian mixtures in the emission distributions, making it more comparable to state-of-the-art automatic speech recognition. We also introduce a rate-of-speech (ROS) variable within the conditional Gaussian mixtures. We find that, under the current methods, using observed pitch or ROS in the recognition phase does not provide improvement. However, systems trained on pitch or ROS may improve on the baseline in the recognition phase when the pitch or ROS is marginalized out.
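
To make the distinction between an observed and a marginalized auxiliary variable concrete, the sketch below assumes a single linear conditional Gaussian emission, p(x | q, a) = N(x; mu_q + b_q * a, Sigma_q), with a Gaussian prior on the scalar auxiliary value a (for example pitch or ROS). This functional form, the function names, and all numbers are illustrative assumptions and are not taken from the paper; a mixture would sum such terms over weighted components.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian N(x; mean, cov)."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def observed_aux_logpdf(x, a, mu, b, cov):
    """Emission log-likelihood when the auxiliary value a is observed.

    Assumed form: the mean is shifted linearly by a, i.e. mu + b * a.
    """
    return gaussian_logpdf(x, mu + b * a, cov)

def marginalized_aux_logpdf(x, mu, b, cov, a_mean, a_var):
    """Emission log-likelihood with a ~ N(a_mean, a_var) integrated out.

    For a linear conditional Gaussian the integral has a closed form:
    N(x; mu + b * a_mean, cov + a_var * b b^T).
    """
    return gaussian_logpdf(x, mu + b * a_mean, cov + a_var * np.outer(b, b))

# Hypothetical parameters for one hidden state (illustration only).
mu = np.array([0.0, 1.0])      # state-dependent mean offset
b = np.array([0.5, -0.2])      # regression weights on the auxiliary value
cov = np.diag([1.0, 2.0])      # state-dependent covariance
x = np.array([0.3, 0.8])       # one acoustic feature vector

log_p_observed = observed_aux_logpdf(x, a=1.2, mu=mu, b=b, cov=cov)
log_p_marginal = marginalized_aux_logpdf(x, mu, b, cov, a_mean=1.0, a_var=0.25)
```

Under this assumption, marginalizing keeps the emission Gaussian but widens its covariance along the direction b, which is one way to see why a model trained with the auxiliary variable can still be used when that variable is hidden at recognition time.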

Similar resources

Auxiliary Variables in Conditional Gaus Speech Recogni

Improving the performance of MFCC for Persian robust speech recognition

The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in automatic speech recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...

Auxiliary Variables in Conditional Gaussian Mixtures for . . .

Using Gaussian Mixtures for Hindi Speech Recognition System

The goal of an automatic speech recognition (ASR) system is to accurately and efficiently convert a speech signal into a text message, independent of the device, speaker, or environment. In general, the speech signal is captured and pre-processed at the front end for feature extraction and evaluated at the back end using the Gaussian mixture hidden Markov model. In this statistical approach since the eva...

Mixed Bayesian Networks with Auxiliary Variables for Automatic Speech Recognition

In standard automatic speech recognition (ASR), hidden Markov models (HMMs) calculate their emission probabilities by an artificial neural network (ANN) or a Gaussian distribution conditioned only upon the hidden state variable. Recent work [12] showed the benefit of conditioning the emission distributions also upon a discrete auxiliary variable, which is observed in training and hidden in reco...

Publication date: 2002
